Nonlinear I: Threshold Autoregressive Models

Sandeep Sarangi

For convenience these models are often simply referred to as TAR models. They capture behaviors that cannot be captured by linear time series models, such as limit cycles, amplitude-dependent frequencies, and jump phenomena. Tong and Lim (1980) showed that the threshold model can reproduce the asymmetric, periodic behavior seen in Wolf's annual sunspot data and the Canadian lynx data.

An example of a first-order TAR model:

\begin{align*} Y_t & = \begin{cases} \phi_{1,0} + \phi_{1,1}Y_{t-1} + \sigma_{1}e_{t}, & Y_{t-1}\le r\\ \phi_{2,0} + \phi_{2,1}Y_{t-1} + \sigma_{2}e_{t}, & Y_{t-1}\gt r\\ \end{cases} \\ \end{align*}

The $\phi$'s are autoregressive parameters, the $\sigma$'s are noise standard deviations, $Y_{t-1}$ is the threshold variable, $r$ is the threshold parameter, and $\{e_{t}\}$ is a sequence of iid random variables with zero mean and unit variance.

Each linear submodel is referred to as a regime. The above is a two-regime model.

Idea: The process switches between two linear mechanisms depending on the position of the lag 1 value of the process.

Consider the following simple first-order TAR model:

\begin{align*} Y_t & = \begin{cases} 0.3 + 0.5Y_{t-1} + 1e_{t}, & Y_{t-1}\le -1\\ -0.2 - 1.8Y_{t-1} + 1e_{t}, & Y_{t-1}\gt -1\\ \end{cases} \\ \end{align*}
In [163]:
library(TSA)
set.seed(12349)
#Lower regime parameters
i1 = 0.3
p1 = 0.5
s1 = 1

#Higher regime parameters
i2 = -0.2
p2 = -1.8
s2 = 1

thresh = -1
delay = 1

#Simulate 100 obs of the above model
y=tar.sim(n=100,Phi1=c(i1,p1),Phi2=c(i2,p2),p=1,d=delay,sigma1=s1,thd=thresh,sigma2=s2)$y

#Plot the data series
plot(y=y,x=1:length(y),type='o',xlab='t',ylab=expression(Y[t]),main="Simulated TAR(2;1,1) model with delay 1")
abline(thresh,0,col="red")

The skeleton of a TAR model is obtained by suppressing the noise term. For the model above, after also dropping the intercepts and setting the threshold to 0 for simplicity, the skeleton is:

\begin{align*} Y_t & = \begin{cases} 0.5Y_{t-1}, & Y_{t-1}\le 0\\ -1.8Y_{t-1}, & Y_{t-1}\gt 0\\ \end{cases} \\ \end{align*}

Stability of the skeleton, together with some regularity conditions, implies stationarity of the TAR model. Stability means that the skeleton is a bounded process for any initial value $Y_{1}$; that is, you examine the behavior of $Y_{t}$ as $t \rightarrow \infty$.

In [164]:
#Check the stability of skeleton using different starting points
startvals = c(-2, -1.1,-0.5, 0.8, 1.2, 3.4)
#startvals = seq(-6,6, by=0.1)
numsim=20
x <- matrix(data=0,nrow=length(startvals),ncol=numsim)
ysk <- numeric(numsim)  #skeleton path for one starting value

count = 1
for (s in startvals) {
	ysk[1] = s
	for (i in 2:numsim) {
		if (ysk[i-1] <= 0) {
			ysk[i] = 0.5*ysk[i-1]
		} else {
			ysk[i] = -1.8*ysk[i-1]
		}
	}
	x[count,] <- ysk
	count = count + 1
}

#Plot the different realizations of the skeleton
matplot(t(x),type="l",xlab='t',ylab=expression(Y[t]), main="Stability of skeleton for different starting values")
abline(0,0)

It has been shown (Chan and Tong, 1985) that the first-order TAR model is stationary if

$$ \phi_{1,1} < 1, \qquad \phi_{2,1} < 1, \qquad \phi_{1,1}\phi_{2,1} < 1 $$

For the simulated model above, $\phi_{1,1} = 0.5 < 1$, $\phi_{2,1} = -1.8 < 1$, and $\phi_{1,1}\phi_{2,1} = -0.9 < 1$, so that model is stationary even though $|\phi_{2,1}| > 1$.

The general two-regime model is written as:

\begin{align*} Y_t & = \begin{cases} \phi_{1,0} + \phi_{1,1}Y_{t-1} + ... + \phi_{1,p_{1}}Y_{t-p_{1}} + \sigma_{1}e_{t}, & Y_{t-d}\le r\\ \phi_{2,0} + \phi_{2,1}Y_{t-1} + ...+ \phi_{2,p_{2}}Y_{t-p_{2}} + \sigma_{2}e_{t}, & Y_{t-d}\gt r\\ \end{cases} \\ \end{align*}

and is referred to as a $TAR(2;p_{1},p_{2})$ model with delay $d$. It is typically assumed that $p_1=p_2=p$ and $1\le d\le p$.

Stability of the skeleton is more complicated in this case, and necessary and sufficient conditions for stationarity of this TAR model are not known. However, it has been shown (Chan and Tong, 1985) that the above TAR model is stationary if

$$ |\phi_{1,1}| + ... + |\phi_{1,p}| < 1, |\phi_{2,1}| + ... + |\phi_{2,p}| < 1 $$

Model Estimation

One approach, and the one discussed here, is conditional least squares (CLS).

For simplicity, in addition to assuming $p_{1} = p_{2} = p, 1 \le d \le p$, assume that $\sigma_{1} = \sigma_{2} = \sigma$. Then the TAR model can conveniently be written as

$$ Y_{t} = \{\phi_{1,0} + \phi_{1,1}Y_{t-1} + ... + \phi_{1,p}Y_{t-p}\}(1 - I(Y_{t-d} \gt r)) + \{\phi_{2,0} + \phi_{2,1}Y_{t-1} + ... + \phi_{2,p}Y_{t-p}\}I(Y_{t-d} \gt r) + \sigma e_{t} $$

where $I(Y_{t-d} > r) = 1$ if $Y_{t-d} \gt r$ and 0 otherwise. CLS minimizes the conditional residual sum of squares:

$$ L(r,d) = \sum_{t = p+1}^{n} \Big[ (Y_{t} - \phi_{1,0} - \phi_{1,1}Y_{t-1} - ... - \phi_{1,p}Y_{t-p})^{2} (1 - I(Y_{t-d} \gt r)) + (Y_{t} - \phi_{2,0} - \phi_{2,1}Y_{t-1} - ... - \phi_{2,p}Y_{t-p})^{2} I(Y_{t-d} \gt r) \Big] $$

Case 1. If both r and d are known.

In this case, you can partition the data into two parts according to whether $Y_{t-d} \le r$ and then perform OLS to estimate the parameters of each linear submodel.
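A minimal sketch of Case 1 for the simulated series above, taking $r=-1$ and $d=1$ as known (both hold for the simulation): build the lag, split the observations by regime, and run OLS within each part.

#CLS with r and d known (here r = -1, d = 1, p = 1)
n = length(y)
yt = y[2:n]          #response Y_t
ylag = y[1:(n-1)]    #lag-1 values, also the threshold variable
low = ylag <= thresh #regime indicator

fit.low = lm(yt[low] ~ ylag[low])     #estimates phi_{1,0}, phi_{1,1}
fit.high = lm(yt[!low] ~ ylag[!low])  #estimates phi_{2,0}, phi_{2,1}
coef(fit.low); coef(fit.high)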

Case 2. If r is unknown.

Search over a range of values for $r$. The threshold must lie somewhere between the smallest and largest values of the time series, or else the series never actually crosses it. Exclude the lowest and highest 10 percent of the observed values from the search, then

  1. Estimate a TAR model for different values of $r=y_{t}$ within this restricted band.
  2. Choose the value of $r$ resulting in the smallest residual sum of squares for the corresponding regression model.
In [165]:
#find percentiles
lq = quantile(y,0.10)
uq = quantile(y,0.90)

#Plot the data series
plot(y=y,x=1:length(y),type='o',xlab='t',ylab=expression(Y[t]),main="Simulated TAR(2;1,1) model with delay 1")
abline(lq,0,col="blue")
abline(uq,0,col="blue")
In [166]:
#Number of models estimated
sum( (lq <= y ) & (y <= uq) )
80
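A sketch of the full search for this example, assuming $d=1$ and $p=1$: every observed value inside the band shown above is tried as a threshold, and the candidate with the smallest pooled residual sum of squares is kept. (The tar() function used later performs this search internally.)

#Grid search for r over observed values inside the 10%-90% band
n = length(y)
yt = y[2:n]
ylag = y[1:(n-1)]
cand = sort(y[(lq <= y) & (y <= uq)])  #candidate thresholds
rss = sapply(cand, function(r) {
	low = ylag <= r
	sum(resid(lm(yt[low] ~ ylag[low]))^2) +
		sum(resid(lm(yt[!low] ~ ylag[!low]))^2)
})
r.hat = cand[which.min(rss)]  #threshold estimate
r.hat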

Case 3. If d is unknown.

Let $d$ take on possible values of 1,2,3,...,$p$. Estimate a TAR model for each potential value of $d$ and then choose the one with the smallest residual sum of squares.

It has been shown (Chan, 1993) that, under mild conditions, the CLS method is consistent.

Minimum AIC (MAIC) approach.

Since in practice the AR orders of the two regimes are not known, a method is needed that allows these to be estimated as well. For a TAR model with fixed $r$ and $d$, the AIC becomes

$$ AIC(p_{1},p_{2}, r,d) = -2l(r,d) + 2(p_{1} + p_{2} + 2) $$

where $l(r,d)$ is the maximized log-likelihood for the given threshold and delay. Parameters are then estimated by minimizing the AIC, with the threshold searched over an interval chosen so that each regime contains sufficient data for estimation.

In [167]:
#Estimate model
#If threshold and delay are known
#model.tar.wth = tar(y, p1 = 1, p2 = 1, d = 1, threshold=-1)

#If threshold is not known
#model.tar = tar(y, p1 = 1, p2 = 1, d = 1, a = 0.10, b = 0.90)

#MAIC approach
AICM = NULL
model.best <- list(d=0, p1=0, p2=0)
AIC.best = Inf

for (d in 1:3) {
	model.tar.s = tar(y, p1 = 3, p2 = 3, d = d, a=0.15, b=0.85)
	if (model.tar.s$AIC < AIC.best) {
		AIC.best = model.tar.s$AIC
		model.best$d = d
		model.best$p1 = model.tar.s$p1
		model.best$p2 = model.tar.s$p2 
	}
	AICM = rbind(AICM, c(d, model.tar.s$AIC, signif(model.tar.s$thd,4), model.tar.s$p1, model.tar.s$p2))
}
colnames(AICM) = c('d', 'nominal AIC', 'r', 'p1', 'p2')
rownames(AICM) = NULL
AICM
d   nominal AIC   r         p1   p2
1   311.2         -1.0020   1    1
2   372.6          0.2218   1    2
3   388.4         -1.3870   1    0

Tests for Nonlinearity

1. Visual inspection using lagged regression plots.

Plot $Y_{t}$ against each of its lags. Fitted regression curves that are not fairly straight indicate a possible nonlinear relationship.

In [168]:
lagplot(y)

2. Keenan's test:

Consider the following model motivated by a second-order Volterra expansion:

$$ Y_{t} = \theta_{0} + \phi_{1}Y_{t-1} + ... + \phi_{m}Y_{t-m} + \eta\Big(\sum_{j=1}^{m}\phi_{j}Y_{t-j}\Big)^{2} + \epsilon_{t} $$

where $\{\epsilon_{t}\}$ are iid normally distributed with zero mean and finite variance. If $\eta = 0$, the model becomes an AR($m$) model.

It can be shown that Keenan's test is equivalent to testing $\eta = 0$ in the regression model:

$$ Y_{t} = \theta_{0} + \phi_{1}Y_{t-1} + ... + \phi_{m}Y_{t-m} + \eta \hat{Y}_{t}^{2} + \epsilon_{t} $$

where $\hat{Y}_{t}$ are the fitted values from regressing $Y_{t}$ on $Y_{t-1},...,Y_{t-m}$.
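A rough illustration for $m=1$ on the simulated series, assuming a plain t-test on $\eta$ in the augmented regression stands in for the exact statistic (Keenan's actual procedure first orthogonalizes $\hat{Y}_{t}^{2}$ against the lags; Keenan.test() below implements the proper F-version):

#Keenan-style check for m = 1: add squared fitted values to the AR regression
n = length(y)
yt = y[2:n]
ylag = y[1:(n-1)]
fit.lin = lm(yt ~ ylag)           #linear AR(1) regression
yhat2 = fitted(fit.lin)^2         #squared fitted values
fit.aug = lm(yt ~ ylag + yhat2)   #augmented regression
summary(fit.aug)$coefficients["yhat2", ]  #estimate, s.e., t value, p-value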

3. Tsay's test:

A more general alternative to Keenan's test. The term $\eta(\sum_{j=1}^{m}\phi_{j}Y_{t-j})^{2}$ in the model above is replaced by a richer expression involving cross products of the lags, and the test becomes an F-test, on a quadratic regression model, of whether all the nonlinear terms are zero.
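A sketch of that idea for $m=2$, assuming the quadratic terms are simply appended to the linear regression and compared with an F-test (Tsay.test() below implements the actual test):

#Tsay-style check for m = 2: F-test of all quadratic lag terms
n = length(y)
yt = y[3:n]
l1 = y[2:(n-1)]
l2 = y[1:(n-2)]
fit0 = lm(yt ~ l1 + l2)                                  #null: linear AR(2)
fit1 = lm(yt ~ l1 + l2 + I(l1^2) + I(l1*l2) + I(l2^2))   #plus quadratic terms
anova(fit0, fit1)  #F-test that the quadratic coefficients are all zero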

In [169]:
#Check for non-lineary using formal tests: Keenan, Tsay
#Null is an AR model of order 1
Keenan.test(y,1)
$test.stat
90.2589565661567
$p.value
1.76111433596097e-15
$order
1
In [170]:
Tsay.test(y,1)
$test.stat
71.34
$p.value
3.201e-13
$order
1

4. Testing for threshold nonlinearity

This is a likelihood-ratio-based test.

The null hypothesis is an AR($p$) model; the alternative hypothesis is a two-regime TAR model of order $p$ with constant noise variance, i.e. $\sigma_{1} = \sigma_{2} = \sigma$. Under these assumptions, the general model can be rewritten as

$$ Y_{t} = \{\phi_{1,0} + \phi_{1,1}Y_{t-1} + ... + \phi_{1,p}Y_{t-p}\} + \{\phi_{2,0} + \phi_{2,1}Y_{t-1} + ... + \phi_{2,p}Y_{t-p}\}I(Y_{t-d} \gt r) + \sigma e_{t} $$

and the null hypothesis states that $\phi_{2,0} = \phi_{2,1} = ... = \phi_{2,p} = 0$.

The test is carried out for fixed order $p$ and fixed delay $d$. The likelihood ratio test statistic can be shown to be equivalent to

$$ T_{n} = (n-p) \log \Big\{\frac{\hat{\sigma}^2(H_{0})}{\hat{\sigma}^2(H_{1})}\Big\} $$

where $n - p$ is the effective sample size, $\hat{\sigma}^2(H_{0})$ is the MLE of the noise variance from the linear AR($p$) fit, and $\hat{\sigma}^2(H_{1})$ is the MLE of the noise variance from the TAR fit, with the threshold value searched over some finite interval.

The likelihood ratio statistic has a nonstandard sampling distribution under $H_{0}$; see Chan (1991) and Tong (1990).
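As a sketch, $T_{n}$ can be assembled by hand for $p = d = 1$; no p-value is attempted precisely because the null distribution is nonstandard (tlrt() in the next cell handles that):

#Hand-rolled T_n for p = d = 1; variance MLEs are RSS over effective sample size
n = length(y)
yt = y[2:n]
ylag = y[1:(n-1)]
sig2.H0 = sum(resid(lm(yt ~ ylag))^2) / (n - 1)  #AR(1) fit under H0
cand = quantile(y, seq(0.15, 0.85, by = 0.01))   #threshold search band
rss = sapply(cand, function(r) {
	low = ylag <= r
	sum(resid(lm(yt[low] ~ ylag[low]))^2) +
		sum(resid(lm(yt[!low] ~ ylag[!low]))^2)
})
sig2.H1 = min(rss) / (n - 1)  #TAR fit at the best threshold
(Tn = (n - 1) * log(sig2.H0 / sig2.H1))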

In [171]:
res = tlrt(y, p=1, d=1, a=0.15, b=0.85)
res
$percentiles
14.1  85.9

$test.statistic
142.291963130459

$p.value
0

Model Diagnostics

Model diagnostics are done using residual analysis. The residuals for a TAR model are defined as

$$ \hat{\epsilon}_{t} = Y_{t} - \{\hat{\phi}_{1,0} + \hat{\phi}_{1,1}Y_{t-1} + ... + \hat{\phi}_{1,p}Y_{t-p}\} I(Y_{t-\hat{d}} \le \hat{r}) - \{\hat{\phi}_{2,0} + \hat{\phi}_{2,1}Y_{t-1} + ... + \hat{\phi}_{2,p}Y_{t-p}\} I(Y_{t-\hat{d}} \gt \hat{r}) $$

The standardized residuals are the raw residuals standardized by their appropriate standard deviations:

$$ \hat{e}_{t} = \frac{\hat{\epsilon}_{t}}{\hat{\sigma}_{1}I(Y_{t-\hat{d}} \le \hat{r}) + \hat{\sigma}_{2}I(Y_{t-\hat{d}} \gt \hat{r})} $$

A plot of the standardized residuals should look random if the TAR model is the true data-generating mechanism. The independence assumption on the standardized errors can be checked by examining the sample ACF of the standardized residuals.
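For intuition, the standardized residuals can be computed directly from the formula above; here the true parameters of the simulated model stand in for the estimates (tsdiag() below runs the real diagnostics on the fitted model):

#Standardized residuals from the true simulated model (d = 1)
n = length(y)
yt = y[2:n]
ylag = y[1:(n-1)]
low = ylag <= thresh
eps = yt - ifelse(low, i1 + p1*ylag, i2 + p2*ylag)  #raw residuals
e.std = eps / ifelse(low, s1, s2)                   #divide by regime std. dev.
plot(e.std, type='h', xlab='t', ylab='standardized residual')
abline(0,0)
acf(e.std, main="ACF of standardized residuals")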

Another test for model specification can be carried out using the quantity

$$ B_{m} = n_{eff}\sum_{i=1}^{m}\sum_{j=1}^{m} q_{i,j}\hat{\rho}_{i}\hat{\rho}_{j} $$

where $n_{eff} = n - \max(p_{1},p_{2},d)$ is the effective sample size, $\hat{\rho}_{i}$ is the $i$th-lag sample autocorrelation of the standardized residuals, and $q_{i,j}$ are model-dependent constants.

$B_{m}$ is approximately distributed as Chi-squared with $m$ degrees of freedom.

Idea: If the true model is a TAR model, the $\hat{\rho}_{i}$ are likely close to zero and hence $B_{m}$ should be as well; a large value of $B_{m}$ suggests an incorrect model specification.

For this test, a plot of $B_{m}$ over a range of $m$ values is examined.

In [172]:
#Model diagnostics
#win.graph(width=4.875, height=4.5)
model.tar.best = tar(y, p1 = model.best$p1, p2 = model.best$p2, d = model.best$d, a = 0.15, b = 0.85)
#model.tar.best = tar(y, p1 = 1, p2 = 1, d = 1, a = 0.15, b = 0.85)
tsdiag(model.tar.best, gof.lag=20)

Forecasting

Predictive distributions from TAR models are often nonnormal and intractable, so forecasting typically relies on simulation. Consider the model

$$ Y_{t+1} = h(Y_{t}, e_{t+1}) $$

Then given $Y_{t} = y_{t}, Y_{t-1} = y_{t-1}$, ... , you have

$$ Y_{t+1} = h(y_{t}, e_{t+1}) $$

So a realization of $Y_{t+1}$ from the one-step ahead predictive distribution can be obtained by drawing $e_{t+1}$ from the error distribution and computing $h(y_{t},e_{t+1}).$

Repeating this procedure independently B times yields a random sample of B values from the one-step-ahead predictive distribution.

The one-step-ahead predictive mean can be estimated by the sample mean of these B values, and a 95% prediction interval by the interval from the 2.5th to the 97.5th percentile of the simulated values.

The simulation approach can be easily extended to finding the $l$-step ahead predictive distribution for any integer $l \ge 2$ by iterating:

$$ Y_{t+1} = h(Y_{t}, e_{t+1}) \\ Y_{t+2} = h(Y_{t+1}, e_{t+2}) \\ \vdots \\ Y_{t+l} = h(Y_{t+l-1}, e_{t+l}) $$

where $Y_{t} = y_{t}$ and $\{e_{t+1}, e_{t+2},...,e_{t+l}\}$ is a random sample of $l$ values drawn from the error distribution.
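A minimal sketch of this scheme for the simulated TAR above, assuming normal errors and using the true parameters in place of fitted ones (the predict() call in the next cell does the analogous computation for the fitted model):

#Simulate B paths of length l forward from the last observed value (d = 1)
B = 1000
l = 10
paths = matrix(0, nrow=B, ncol=l)
for (b in 1:B) {
	cur = y[length(y)]
	for (h in 1:l) {
		e = rnorm(1)  #draw from the error distribution
		cur = if (cur <= thresh) i1 + p1*cur + s1*e else i2 + p2*cur + s2*e
		paths[b,h] = cur
	}
}
pred.mean = colMeans(paths)                            #point forecasts
pred.int = apply(paths, 2, quantile, c(0.025, 0.975))  #95% prediction intervals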

In [173]:
#Forecasting
model.tar.pred <- predict(model.tar.best, n.ahead = 10, n.sim=1000)
y.pred = ts(c(y, model.tar.pred$fit), frequency=1, start=start(y))
plot(y.pred, type='n', ylim=range(c(y.pred,model.tar.pred$pred.interval)),ylab=expression(Y[t]), xlab='t')
lines(y)
lines(window(y.pred, start=end(y) + c(0,1)), lty=2)
lines(ts(model.tar.pred$pred.interval[2,], start=end(y) + c(0,1), freq=1), lty=2)
lines(ts(model.tar.pred$pred.interval[1,], start=end(y) + c(0,1), freq=1), lty=2)
abline(-1,0,col="red")

Data Example

The time series modeled here is the yearly number of sunspots from 1700 to 1988.

In [174]:
#DATA SET
#Sunspot series, Annual

plot.ts(sunspot.year, xlab="Time", ylab="Yearly numbers of sunspots")
In [175]:
#Check for non-linearity via lag regression plot
#win.graph(width=4.875,height=6.5,pointsize=8)
lagplot(sunspot.year)
In [176]:
#Check for linearity using hypothesis tests
Keenan.test(sunspot.year)
Tsay.test(sunspot.year)
Keenan.test output:
$test.stat
18.2840758932705
$p.value
2.64565849317573e-05
$order
9

Tsay.test output:
$test.stat
3.904
$p.value
6.689e-12
$order
9
In [177]:
#Use MAIC method
AICM = NULL
for (d in 1:9) {
	sunspot.tar.s = tar(sunspot.year, p1 = 9, p2 = 9, d = d, a=0.15, b=0.85)
	AICM = rbind(AICM, c(d, sunspot.tar.s$AIC, signif(sunspot.tar.s$thd,4), sunspot.tar.s$p1, sunspot.tar.s$p2))
}
colnames(AICM) = c('d', 'nominal AIC', 'r', 'p1', 'p2')
rownames(AICM) = NULL
AICM
d   nominal AIC   r      p1   p2
1   2285          22.7   6    9
2   2248          41.0   9    9
3   2226          31.5   7    9
4   2251          47.8   8    7
5   2296          84.8   9    3
6   2291          19.8   8    9
7   2272          43.9   9    9
8   2244          48.5   9    2
9   2221          47.5   9    3
In [178]:
#Testing for threshold nonlinearity
res = tlrt(sunspot.year, p=9, d=9, a=0.15, b=0.85)
res
$percentiles
15  85

$test.statistic
52.2571950943405

$p.value
6.8337179274236e-06
In [179]:
#Model diagnostics
sunspot.tar.best <- tar(sunspot.year, p1 = 9, p2 = 3, d = 9, a = 0.15, b = 0.85)
tsdiag(sunspot.tar.best)
Error in diag(eigv) %*% t(eigm): non-conformable arguments
Traceback:

1. tsdiag(sunspot.tar.best)
2. tsdiag.TAR(sunspot.tar.best)
3. box.ljung(std.res, dxy1, dxy2, nlag = i, is.print = FALSE)
In [180]:
#Forecasting
sunspot.tar.pred <- predict(sunspot.tar.best, n.ahead = 10, n.sim=1000)
sunspot.pred = ts(c(sunspot.year, sunspot.tar.pred$fit), frequency=1, start=start(sunspot.year))
plot(sunspot.pred, type='n', ylim=range(c(sunspot.pred, sunspot.tar.pred$pred.interval)), ylab="Annual", xlab="Time")
lines(sunspot.year)
lines(window(sunspot.pred, start=end(sunspot.year) + c(0,1)), lty=2)
lines(ts(sunspot.tar.pred$pred.interval[2,], start=end(sunspot.year) + c(0,1), freq=1), lty=2)
lines(ts(sunspot.tar.pred$pred.interval[1,], start=end(sunspot.year) + c(0,1), freq=1), lty=2)
In [181]:
#fit linear AR model
#pacf(sunspot.year)
#try AR order 9
ord = 9
ar.mod <- arima(sunspot.year, order=c(ord,0,0), method="CSS-ML")
ar.fitted <- sunspot.year - ar.mod$res
sunspot.tar.fitted <- sunspot.year[(ord+1):length(sunspot.year)] - sunspot.tar.best$res

plot.ts(sunspot.year[10:289], xlab="Time",ylab="Yearly numbers of sunspots", main="Comparison of models")
lines(sunspot.tar.fitted, col="blue", lty=2)
lines(ar.fitted[10:289],col="red", lty=3)
legend('topleft', c("Actual","TAR","AR(9)"),lty=c(1,2,3), col=c("black","blue","red"))

AR Performance on Simulated TAR Models

Example 1. Fit an AR(4) to the TAR model

\begin{align*} Y_t & = \begin{cases} 0.3 + 0.5Y_{t-1} + 1e_{t}, & Y_{t-1}\le -1\\ -0.2 - 1.8Y_{t-1} + 1e_{t}, & Y_{t-1}\gt -1\\ \end{cases} \\ \end{align*}
In [182]:
set.seed(12349)
#Lower regime parameters
i1 = 0.3
p1 = 0.5
s1 = 1

#Higher regime parameters
i2 = -0.2
p2 = -1.8
s2 = 1

thresh = -1
delay = 1

nobs = 200
#Simulate 200 obs of the above model
y=tar.sim(n=nobs,Phi1=c(i1,p1),Phi2=c(i2,p2),p=1,d=delay,sigma1=s1,thd=thresh,sigma2=s2)$y

#Use Tsay's test to determine the order of the best AR fit
ord <- Tsay.test(y)$order

#fit linear AR model
#pacf(y)
#try AR order 4
ar.mod <- arima(y, order=c(ord,0,0), method="CSS-ML")
ar.fitted <- y - ar.mod$res

plot.ts(y[(ord+1):nobs], xlab="Time",ylab="Value", main="Fitted AR")
lines(ar.fitted[(ord+1):nobs],col="red", lty=3)
legend('topleft', c("Simulated","AR(4)"),lty=c(1,3), col=c("black","red"))
abline(thresh,0,col="blue")

Example 2. Fit an AR(4) to the TAR model

\begin{align*} Y_t & = \begin{cases} 0.3 + 0.5Y_{t-1} + 1e_{t}, & Y_{t-1}\le -1\\ -0.2 - 1.8Y_{t-1} + 2e_{t}, & Y_{t-1}\gt -1\\ \end{cases} \\ \end{align*}
In [183]:
#Lower regime parameters
i1 = 0.3
p1 = 0.5
s1 = 1

#Higher regime parameters
i2 = -0.2
p2 = -1.8
s2 = 2

thresh = -1
delay = 1

nobs = 200
#Simulate 200 obs of the above model
y=tar.sim(n=nobs,Phi1=c(i1,p1),Phi2=c(i2,p2),p=1,d=delay,sigma1=s1,thd=thresh,sigma2=s2)$y

#Use Tsay's test to determine the order of the best AR fit
ord <- Tsay.test(y)$order

#fit linear AR model
#pacf(y)
#try AR order 4
ar.mod <- arima(y, order=c(ord,0,0), method="CSS-ML")
ar.fitted <- y - ar.mod$res

plot.ts(y[(ord+1):nobs], xlab="Time",ylab="Value", main="Fitted AR")
lines(ar.fitted[(ord+1):nobs],col="red", lty=3)
legend('topleft', c("Simulated","AR(4)"),lty=c(1,3), col=c("black","red"))
abline(thresh,0,col="blue")

Example 3. Fit an AR(3) to the TAR model

\begin{align*} Y_t & = \begin{cases} 0.3 + 0.5Y_{t-1} + 1e_{t}, & Y_{t-1}\le 0\\ -0.2 - 1.8Y_{t-1} + 2e_{t}, & Y_{t-1}\gt 0\\ \end{cases} \\ \end{align*}
In [184]:
#Lower regime parameters
i1 = 0.3
p1 = 0.5
s1 = 1

#Higher regime parameters
i2 = -0.2
p2 = -1.8
s2 = 2

thresh = 0
delay = 1

nobs = 200
#Simulate 200 obs of the above model
y=tar.sim(n=nobs,Phi1=c(i1,p1),Phi2=c(i2,p2),p=1,d=delay,sigma1=s1,thd=thresh,sigma2=s2)$y

#Use Tsay's test to determine the order of the best AR fit
ord <- Tsay.test(y)$order

#fit linear AR model
#pacf(y)
#try AR order 3
ar.mod <- arima(y, order=c(ord,0,0), method="CSS-ML")
ar.fitted <- y - ar.mod$res

plot.ts(y[(ord+1):nobs], xlab="Time",ylab="Value", main="Fitted AR")
lines(ar.fitted[(ord+1):nobs],col="red", lty=3)
legend('topleft', c("Simulated","AR(3)"),lty=c(1,3), col=c("black","red"))
abline(thresh,0,col="blue")

Example 4. Fit an AR(7) to the TAR model

\begin{align*} Y_t & = \begin{cases} 0.3 + 0.5Y_{t-1} + 1e_{t}, & Y_{t-3}\le 0\\ -0.2 - 1.2Y_{t-1} + 1e_{t}, & Y_{t-3}\gt 0\\ \end{cases} \\ \end{align*}
In [185]:
#Lower regime parameters
i1 = 0.3
p1 = 0.5
s1 = 1

#Higher regime parameters
i2 = -0.2
p2 = -1.2
s2 = 1

thresh = 0
delay = 3

nobs = 200
#Simulate 200 obs of the above model
y=tar.sim(n=nobs,Phi1=c(i1,p1),Phi2=c(i2,p2),p=1,d=delay,sigma1=s1,thd=thresh,sigma2=s2)$y

#Use Tsay's test to determine the order of the best AR fit
ord <- Tsay.test(y)$order

#fit linear AR model
#pacf(y)
#try AR order 7
ar.mod <- arima(y, order=c(ord,0,0), method="CSS-ML")
ar.fitted <- y - ar.mod$res

plot.ts(y[(ord+1):nobs], xlab="Time",ylab="Value", main="Fitted AR")
lines(ar.fitted[(ord+1):nobs],col="red", lty=3)
legend('topleft', c("Simulated","AR(7)"),lty=c(1,3), col=c("black","red"))
abline(thresh,0,col="blue")

References
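
Chan, K.S. 1991. "Percentage points of likelihood ratio tests for threshold autoregression." Journal of the Royal Statistical Society, Ser. B, 53, 691-696.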

Chan, K.S. 1993. "Consistency and limiting distribution of the least squares estimator of a threshold autoregressive model." Annals of Statistics, 21, 520-533.

Cryer, J.D. and Chan, K.S. 2008. Time Series Analysis: With Applications in R, 2nd ed. New York: Springer.

Chan, K.S. and Tong, H. 1985. "On the use of the deterministic Lyapunov function for the ergodicity of stochastic difference equations." Adv. Appl. Prob., 17, 666-678.

Enders, W. 2010. Applied Econometric Time Series. John Wiley & Sons.
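
Tong, H. 1990. Non-linear Time Series: A Dynamical System Approach. Oxford: Oxford University Press.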

Tong, H., and Lim, K.S. 1980. "Threshold Autoregression, Limit Cycles, and Cyclical Data" (with discussion). Journal of the Royal Statistical Society, Ser. B, 42, 245-292.

Tsay, R.S. 1998. "Testing and Modeling Multivariate Threshold Models." Journal of the American Statistical Association, 93, 1188-1202.